Parallelization of the AVL FIRE Benchmark with SVM-Fortran

Author: Gerndt Michael
Publication venue: Zentralinstitut für Angewandte Mathematik
Publication date: 01/01/1995

This article outlines the parallelization of an irregular grid application with SVM-Fortran. It describes the different optimizations and their effectiveness. The parallelization was much simplified by the performance analysis tool OPAL, a source-code-based tool for requesting and analyzing runtime performance data. Although shared memory parallelization is easier than distributed memory parallelization, understanding and eliminating the overhead from page faults is impossible without such a tool. OPAL relates the page faults to the arrays and to the location in the source code. An area that OPAL does not support, but where supporting tools are highly desirable, is the performance degradation due to low utilization of the on-chip cache.

Juelich Shared Electronic Resources

Distribution of Periscope Analysis Agents on ALTIX 4700

Authors: Gerndt Michael; Strohhäcker Sebastian
Publication venue: John von Neumann Institute for Computing
Publication date: 01/01/2007

Juelich Shared Electronic Resources

Designing an Adaptive Application-Level Checkpoint Management System for Malleable MPI Applications

Authors: Gerndt Michael; John Jophin
Publication date: 08/11/2022

Dynamic resource management opens up numerous opportunities in High Performance Computing. It improves system-level services as well as application performance. Checkpointing can also be regarded as a system-level service and can reap the benefits offered by dynamism. A checkpointing system can have better resource availability by integrating with a malleable resource management system. In addition to fault tolerance, the checkpointing system can cater to the data redistribution demand of malleable applications during resource change. Therefore, we propose iCheck, an adaptive application-level checkpoint management system that can efficiently utilize system-level and application-level dynamism to provide better checkpointing and data redistribution services to applications. Comment: Third International Symposium on Checkpointing for Supercomputing (SuperCheck-SC22)

arXiv.org e-Print Archive

Scalability and Performance Analysis of OpenMP Codes Using the Periscope Toolkit

Authors: Benedict Shajulin; Gerndt Michael
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 10/02/2015

In this paper, we present two new approaches, together with the necessary extensions to Periscope, for performing scalability and performance analysis of OpenMP codes. Periscope is an online performance analysis toolkit consisting of a user-defined number of analysis agents that automatically search for performance properties while the application is running. In order to detect the scalability and performance bottlenecks of OpenMP codes using Periscope, a few newly defined performance properties and meta properties are formalized. We demonstrate our implementation by evaluating the NAS OpenMP benchmarks. As our results show, our approach identifies the code regions that do not scale well as well as other performance problems, e.g. load imbalance, in the NAS parallel benchmarks.

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

SVM Support in the Vienna Fortran Compilation System

Authors: Brezany Peter; Gerndt Michael; Sipkova Viera
Publication venue: Zentralinstitut für Angewandte Mathematik
Publication date: 01/01/1994

Vienna Fortran, a machine-independent language extension to Fortran that allows the user to write programs for distributed-memory systems using global addresses, provides the forall-loop construct for specifying irregular computations that do not cause inter-iteration dependences. Compilers for distributed-memory systems generate code that is based on runtime analysis techniques and is only efficient if, in addition, aggressive compile-time optimizations are applied. Since these optimizations are difficult to perform, we propose to generate shared virtual memory code instead, which can benefit from appropriate operating system or hardware support. This paper presents the shared virtual memory code generation, compares the two approaches, and gives first performance results.

Juelich Shared Electronic Resources

Zum Stigma des Sozialschmarotzers

Authors: Dauskardt Michael; Gerndt Helge; Moser Johannes
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 01/01/1993

Open Access LMU

A Comparison of two Parallelization Strategies for TRACE

Authors: Gerndt Michael; Neuendorf Olaf; Prümmer Joachim; Vereecken Harry
Publication venue: Zentralinstitut für Angewandte Mathematik
Publication date: 01/01/1994

In this report we compare two different methods of parallelizing a finite element code describing water flow in soils. The first method uses Domain Decomposition based on a parallel Schwarz algorithm. The second method uses a Data Partitioning approach pursued in High Performance Fortran (HPF). Experiments with the parallel versions were performed on the Paragon XP/S 10 at KFA.

Juelich Shared Electronic Resources